Annual Forecast for the Coming Decade in Computing 


Our covers contain the 4th of our annual sets of 
predictions. How have we done? 


It has been observed that short-range predictions 
tend to be optimistic; that is, the time passes quickly and 
nothing happens that was supposed to happen. Long-range 
predictions tend to be pessimistic; the predicted event, 
if it comes, comes much sooner. The cross-over between 
these two is about six years. It is too early to tell which 
way we tend in our predictions. 


In 1976 we predicted that there would be 250,000 
personal computers by early 1980; we are now modifying that 
prediction. Some of our predictions have already come 
true: PL/I is dying rapidly; we are probably now in the 5th 
generation of machines; a new largest prime was discovered; 
automobile computers are being installed. 


Some have been dead wrong: no large mainframe maker 
has dropped out recently; the pocket computer has not 
appeared (and probably won't). And some (like the one 
a year ago regarding TI's entry into the personal computing 
field) were just slightly off in their timing. 


It is too early to try to keep a boxscore; perhaps a 
tally of successes and failures will be feasible after ten 
years. Any fool can make accurate predictions simply by 
making a great many of them; the success rate must include 
the booboos. Then, too, there is an analogy to the work 
of the weather bureau; when they predict rain, does one drop 
constitute success for them, or must there be some bridges 
washed out? When the time comes, we will persuade some 
outsider to rate us objectively. Meanwhile, the fun 
continues...join in if you care to. 
Eight Is Not Enough 


by David Babcock 


Flowchart R shows the necessary and sufficient logic 
to put 3 things (A, B, and C) into ascending order by direct 
internal sorting. The 3 comparisons and interchanges 
shown are not a unique set; the same result could be obtained 
from the comparisons of A to C, B to C, and A to B, in that 
order. 


For brevity, let us symbolize all the logic shown in 
the circle of Figure R as BC. With that notation, the 
basic scheme for a 3-item sort is then AB BC AB. 


Unlike other sorting methods where the sequence of 
comparisons is directed by the outcome of previous comparisons, 
direct internal sorting is based on a homogeneous sequence 
of comparisons. That is, a predefined set of comparisons 
is "blindly" applied without regard to any previous 
interchanges. Such a constrained method of sorting is important 
because of the ease with which it can be implemented in 
hardware. 
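
The idea of a fixed, outcome-independent sequence can be sketched in a few lines of Python. This is only an illustration, not the author's program; the names apply_scheme and sorts_everything are invented here, and positions A, B, C are taken to be list indices 0, 1, 2.

```python
from itertools import permutations

def apply_scheme(items, scheme):
    """Apply a fixed list of compare-interchange steps to a copy of items.

    Each step is a pair of positions (i, j) with i < j. The two values are
    interchanged only if they are out of order; the list of steps itself
    never depends on what earlier steps did.
    """
    values = list(items)
    for i, j in scheme:
        if values[i] > values[j]:
            values[i], values[j] = values[j], values[i]
    return values

def sorts_everything(scheme, n):
    """True if the scheme orders every arrangement of the values 1..n."""
    target = list(range(1, n + 1))
    return all(apply_scheme(p, scheme) == target for p in permutations(target))

A, B, C = 0, 1, 2
AB_BC_AB = [(A, B), (B, C), (A, B)]   # the scheme of the flowchart
AC_BC_AB = [(A, C), (B, C), (A, B)]   # the alternative set mentioned in the text

print(sorts_everything(AB_BC_AB, 3))            # True
print(sorts_everything(AC_BC_AB, 3))            # True
print(sorts_everything([(A, B), (B, C)], 3))    # False: two steps are not enough
```

Because the steps are data rather than control flow, the same list could equally well describe a row of hardware comparators.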
Figure S 

2:  AB 
3:  AB BC AB 
4:  AB BC CD AB BC AB 
5:  AB BC CD DE AB BC CD AB BC AB 
6:  AB BC CD DE EF AB BC CD DE AB BC CD AB BC AB 

(i.e., case N is formed from case N-1 by preceding the 
pattern by all possible comparisons from left to right.) 


The basic scheme shown in the flowchart for a 3-item 
sort extends readily to the sorting of any number of items, 
as shown in Figure S. However, it is observed that this 
simple scheme is not the most efficient possible. For 
example, 4 items can be sorted with the scheme: 


AB CD AC BD BC 


which leads us to the following state of affairs: 

Number of Number of Number of 
things comparisons comparisons 
to sort in theory actually needed 


Thus, attention is drawn to the case N = 5. The problem 
is: What is the minimum number of comparisons and 
interchanges needed to sort 5 things? 


(Various interesting theoretical approaches suggested 
that the number of comparisons needed should be 8, which 
might be a satisfactory solution except that then the 
burning question is which 8?) 


I set out to establish the answer to the problem, 
using a brute force approach; that is, by trying every 
possible combination of 8 comparisons to sift out the 
combination that would work. 


Note: while having all the fun of coding and running 
this monumental search, I discovered that Knuth had already 
solved the problem* and that the answer is 9. 


However, there is some point to reporting on the 
exhaustive attack, and a side effect popped up that may 
be of more value than the original research. 


We need some notation. We can label the positions 
to be sorted as follows: 


A   B   C   D   E 


And the ten possible comparisons can be coded this way: 


0  AB      5  BD 
1  AC      6  BE 
2  AD      7  CD 
3  AE      8  CE 
4  BC      9  DE 


We can form a systematic scheme for testing all combinations: 
  1    2    3    4    5    6    7    8 

[   ][   ][   ][   ][   ][   ][   ][   ] 


Each of these boxes represents one of the 8 comparisons 
to be applied to the 5 sort positions. Comparisons are to 
be applied in order (left to right). Each box can thus 
take on the values from 0 to 9, taken from the coding scheme 
given above. To test all possible combinations systematically, 
run the 8 boxes as an 8-digit decimal counter. This implies 
that there are 100,000,000 combinations to test, and for each 
such combination we must try the 120 permutations of 5 things, 
to ensure that each of them is put into ascending order. 
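
As a rough illustration of the counter idea (in Python, not the assembly language of the actual program), a counter value can be decoded digit by digit into comparisons and then tried on every permutation. The names COMPARISONS, decode and sorts_all are invented for this sketch, and positions A to E are taken to be list indices 0 to 4.

```python
from itertools import combinations, permutations

# Codes 0..9 in the order of the coding table:
# 0 AB, 1 AC, 2 AD, 3 AE, 4 BC, 5 BD, 6 BE, 7 CD, 8 CE, 9 DE.
COMPARISONS = list(combinations(range(5), 2))
ALL_PERMUTATIONS = list(permutations(range(1, 6)))

def decode(counter, digits=8):
    """Turn a decimal counter value into a list of position pairs."""
    text = str(counter).zfill(digits)
    return [COMPARISONS[int(d)] for d in text]

def sorts_all(scheme):
    """Try the scheme on all 120 permutations of 5 things."""
    for perm in ALL_PERMUTATIONS:
        values = list(perm)
        for i, j in scheme:
            if values[i] > values[j]:
                values[i], values[j] = values[j], values[i]
        if values != [1, 2, 3, 4, 5]:
            return False
    return True

print(sorts_all(decode(98989898)))               # False: only DE and CE are used
# The 10-comparison scheme AB BC CD DE AB BC CD AB BC AB, coded 0479047040:
print(sorts_all(decode(479047040, digits=10)))   # True
```

A full run would call sorts_all for every counter value from 00000000 to 99999999, which is what makes the improvements described next worthwhile.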


*This phenomenon--namely, discovering that Knuth has 
anticipated what you have just discovered--is becoming 
increasingly frequent, and will become even more so when 
the remaining four volumes of The Art of Computer Programming 
appear. For the result we need here, see Section 
5.3.4 of Knuth. 
Side note: There is no need to additionally test the 
7-comparison combinations. If such a set of 
7 comparisons existed, it would have been found 
while testing 8-comparison combinations. For 
example, if the first 7 comparisons had done all 
the work, then it wouldn't matter what the 8th 
comparison was, and 10 workable solutions would 
have been found. In all, (at least) 80 sets 
of level 8 solutions would have been found for 
each level 7 solution. 


In order to decrease the amount of computer time 
needed to run the program, many improvements were made, 
both to the method of solution and to the code itself. 
These efficiencies can be divided roughly into four areas. 


Improvement #1: 


When testing a set of comparisons, not all the 
120 permutations of 5 items need to be tried. 
As soon as a trial set of comparisons fails to 
order one of the permutations, no further testing 
needs to be done with that set. Only a "good" 
set of comparisons needs to be applied to all 
of the possible permutations (and then to only 
119 of them, since the arrangement 12345 will 
always be in sort). 


The first improvement, then, is to order the 
permutations in such a way that the permutation 
in "worst sort" is tested first; the next worst 
second; and so on. By doing this, most of the 
bad sets of comparisons will be eliminated by 
performing only one sort operation. This 
is a significant improvement to the solution 
(in the production run, 89% of the cases were 
eliminated on the first sort). 
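
The early exit can be sketched as follows, again in Python and only as an illustration. The function name first_failure is invented here, and WORST_FIRST is a deliberately abbreviated test order holding just the first two permutations named in the run statistics plus the sorted arrangement.

```python
def first_failure(scheme, ordered_perms):
    """Return the 1-based level of the first permutation the scheme fails
    to sort, or None if it sorts every permutation in the list."""
    for level, perm in enumerate(ordered_perms, start=1):
        values = list(perm)
        for i, j in scheme:
            if values[i] > values[j]:
                values[i], values[j] = values[j], values[i]
        if values != sorted(values):
            return level          # stop at once; no further permutations are tried
    return None

A, B, C, D, E = range(5)
WORST_FIRST = [(5, 3, 4, 2, 1), (5, 4, 2, 3, 1), (1, 2, 3, 4, 5)]  # abbreviated list

# A scheme that never touches positions A or B fails on the very first test.
print(first_failure([(D, E), (C, E)] * 4, WORST_FIRST))        # 1
# AE BD CD happens to sort 53421 but not 54231, so it survives one level.
print(first_failure([(A, E), (B, D), (C, D)], WORST_FIRST))    # 2
```

The better the ordering, the more often the answer is 1 and the less work is done per combination.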


To implement this improvement requires that some 
measure of "out-of-sort"-ness* be applied to 
each of the 120 permutations. The permutations 
can then be ranked according to this measure. 
This was done and Table 1 lists the 120 permutations 
in the order in which they were tested 
in the production run. 


*As it turns out, the development of a good measure of 
"out-of-sort"-ness is actually more interesting and complex 
than the original problem I set out to solve. More on 
this later. 


Permutations (listed in order tested) 


  1  53421      31  35124      61  51324      91  25341 
  2  54231      32  41523      62  51243      92  52314 
  3  45231      33  23451      63  43215      93  51342 
  4  53412      34  51234      64  25314      94  32541 
  5  43521      35  23514      65  41352      95  52143 
  6  35421      36  25134      66  15432      96  32415 
  7  54213      37  31452      67  34215      97  24315 
  8  54132      38  41253      68  32514      98  42135 
  9  34521      39  24153      69  42153      99  41325 
 10  45213      40  31524      70  25143     100  14352 
 11  43512      41  23154      71  43125     101  13542 
 12  35412      42  21453      72  31542     102  15324 
 13  45132      43  21534      73  14532     103  15243 
 14  54123      44  31254      74  15423     104  23145 
                ---------- 
 15  34512      45  54321      75  34125     105  31245 
 16  45123      46  45321      76  14523     106  13425 
 17  43251      47  54312      77  23415     107  14235 
 18  53214      48  45312      78  13452     108  12453 
 19  25431      49  53241      79  41235     109  12534 
                                             ---------- 
 20  51432      50  52431      80  15234     110  52341 
 21  34251      51  35241      81  32154     111  42315 
 22  24531      52  42531      82  24135     112  15342 
 23  53124      53  52413      83  21543     113  32145 
 24  51423      54  53142      84  31425     114  14325 
 25  35214      55  42513      85  13524     115  12543 
 26  25413      56  35142      86  14253     116  21345 
 27  43152      57  32451      87  21435     117  13245 
 28  41532      58  24351      88  21354     118  12435 
 29  24513      59  23541      89  13254     119  12354 
                               ----------    ---------- 
 30  34152      60  52134      90  42351     120  (12345) 


Table 1 


Improvement #2: 


Any comparison which duplicates the work of 
either of its nearest neighbors is doing no 
useful work. In other words, there is no need 
to try any combination where two adjacent digits 
are the same. 


Note that two subcases need to be considered here: 
1. If the sorting can be done in 8 (and 
no fewer) comparisons, then the above 
explanation holds directly. 


2. If the sorting can be done in 7 
comparisons, then we have succeeded in 
discarding some of our good cases, but 
not all of them. We'll still get (at 
least) 73 sets of level 8 solutions for 
each level 7 solution which exists. 


To implement this, one additional check needs 
to be made. When changing comparison K, if 
the new value equals comparison (K-1), then 
change comparison K again. Picture an 
analogy with an odometer. When a wheel is 
rotated one position, if its "value" is equal 
to the value of the wheel to its left, then 
rotate it again. 


To compute the number of combinations which will 
now be tested: 


comparison:    1    2    3    4    5    6    7    8 

choices:      10    9    9    9    9    9    9    9 

(for comparisons 2 through 8, the number of values 
which do not equal the comparison to its left) 


$10 \cdot 9^{7} = 47{,}829{,}690$ combinations, or less than 
half the number we started with. 
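
The odometer rule and the resulting count can be illustrated with a short Python sketch. It is not the author's code; adjacent_distinct and count_formula are names invented here, and the enumeration is checked on a short counter because listing all 8-digit cases in Python would be slow.

```python
def adjacent_distinct(length, base=10):
    """Yield digit tuples of the given length in which no digit equals
    the digit to its left (the "rotate it again" rule)."""
    def extend(prefix):
        if len(prefix) == length:
            yield tuple(prefix)
            return
        for digit in range(base):
            if prefix and prefix[-1] == digit:
                continue          # same as the wheel to its left: skip it
            prefix.append(digit)
            yield from extend(prefix)
            prefix.pop()
    yield from extend([])

def count_formula(length, base=10):
    """10 choices for the first box, 9 for each box after it."""
    return base * (base - 1) ** (length - 1)

print(sum(1 for _ in adjacent_distinct(4)))   # 7290 by direct enumeration
print(count_formula(4))                       # 7290 by formula
print(count_formula(8))                       # 47829690
```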


Improvement #3: 


Every one of the 5 number positions must be 
tested at least once. Any set of 8 comparisons 
which fails to test any one of the 5 positions 
can be discarded immediately without performing 
any sorts. If the every-position-tested test 
can be performed cheaply enough, we may come out 
ahead with it. (Note that if this test is 
not made, all of the cases it would have 
rejected would have been eliminated by the first 
sorting case, since that case has every position 
out of place.) 


This test was implemented in the production run, 
but it turned out to cost more than it gained, 
because so few cases were eliminated by it. The 
final production run of about 10 hours would have 
been 30 minutes shorter without this test. 
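
One inexpensive way to make the every-position-tested check is with bit masks, sketched below in Python. This is an assumption about how such a test might look, not a description of the production code; MASKS, touches_every_position and rejected_count are names invented here. The counting function is a separate inclusion-exclusion argument added for illustration; it assumes the no-two-adjacent-digits-equal rule of Improvement #2 is in force.

```python
from itertools import combinations

COMPARISONS = list(combinations(range(5), 2))        # codes 0..9: AB, AC, ..., DE
MASKS = [(1 << i) | (1 << j) for i, j in COMPARISONS]
ALL_FIVE = 0b11111

def touches_every_position(codes):
    """True if the coded comparisons between them involve all 5 positions."""
    seen = 0
    for code in codes:
        seen |= MASKS[code]
    return seen == ALL_FIVE

def rejected_count(length=8):
    """Adjacent-distinct sequences that leave at least one position untested.

    Sequences avoiding one given position draw on the 6 comparisons among
    the other four positions; avoiding two given positions leaves only 3
    comparisons. Avoiding three leaves a single comparison, which cannot
    fill two neighbouring boxes, so that term vanishes for length > 1.
    """
    avoid_one = 6 * 5 ** (length - 1)
    avoid_two = 3 * 2 ** (length - 1)
    return 5 * avoid_one - 10 * avoid_two

print(touches_every_position([9, 8, 9, 8, 9, 8, 9, 8]))   # False: A and B untouched
print(touches_every_position([0, 4, 7, 9, 0, 4, 7, 0]))   # True
print(rejected_count())                                    # 2339910
```

The last figure agrees with the count of combinations eliminated for not testing all positions that is reported in the run statistics, which also shows how small a share of the 47,829,690 cases this test can ever remove.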


Improvement #4: 


The remaining improvements fall into the 
category of clever coding to make obvious 
improvements in running time of the production 
program. For example, running the 8-digit 
counter backwards, from 99999999 to 00000000, 
is faster because it is simpler to test for 
zero than to test for greater than 9. 
Extensive use was made of straight-line coding 
in the sort and the test-if-sorted codes to 
eliminate loop control time. 


I coded the program both in Apple (6502) 
assembly language and COMPASS (assembly language 
for the Control Data 3170). In both cases, 
I took all possible shortcuts for speed. From 
timed test runs, it became clear that the 
CDC 3170 would take 50% longer to run than the 
Apple II. 


It should be pointed out that additional 
improvements (such as refining test #2 above) 
could have been made to the method of solution 
but were not, because most such improvements 
would have cost more execution time (due to 
the complexity of making the test) than they 
would have saved. 


One improvement which would have helped greatly 
(giving a 30% savings) and cost nothing to 
implement wasn't made because it wasn't noticed 
until after the production run was finished. 
It is this: comparison number one needs only 
to be tested up through 6 (rather than 9) due 
to the symmetry of the problem. This example 
serves to reinforce the old adage that a solution 
always exists which is better than the one you 
just used. 


As already mentioned, the production program was 
coded in 6502 assembly language and executed in 10 hours 
on an Apple II computer. This run confirmed by exhaustive 
search that there is no set of 8 comparisons that will 
always properly sort 5 items. Flowchart T gives the 
overall logic for the program, with improvement #3 removed. 
Also not shown on the flowchart are a number of counters, 
which were inserted at key points in the code to gather 
statistics about the run. These statistics are given in 
Table 2. 
[Flowchart T. Recoverable boxes: Set PERM = 53421, 54231, 
45231, ..., 12354, 12345. Set COMP = 9,8,9,8,9,8,9,8. 
Sort TEMP using COMP set of comparisons. Sort successful? 
STOP (Success). Continue at Reference 5, next page.] 

In discussions of sorting, it is frequently assumed 
that data in strict descending order constitutes the worst 
case for a sorting scheme. This is sometimes true (it 
certainly is for bubble sorting) but not always. For 
example, consider the Shell sorting scheme (the logic was 
given in our issue 58, page 6). In outlining the method 
of Shell sorting, the steps were given for sorting eleven 
items that were in descending order. For that set of 
data, the sort makes 15 interchanges. But the set: 


LOM IS 9. 8° 7 95 266 4 Se 1 


takes 17 interchanges, and is thus a worse case for Shell 
sorting. 
Number of cases which failed at each level* 


Permutation      Cases which 
number           failed 

     1           40,367,418 
     2            2,943,906 
     3            1,044,582 
     4              571,746 
     5              168,043 
     6              182,240 
     7               74,978 
     8               59,779 
    17               34,143 
    18                9,568 
    19                8,863 
    20               12,244 
    21                2,945 
    24                2,419 
    25                1,607 
    26                1,607 
    33                1,443 
    34                  883 
    35                  718 
    36                  648 


That is, permutation number one (53421) caused 40,367,418 
combinations to fail and they didn't have to be tested any 
further. Of those combinations which passed sorting 
permutation number one, 2,943,906 failed on permutation 
number two (54231) and so on. 


The number of combinations eliminated for not testing 
all positions: 2,339,910. 


Average number of levels tested: 1.23338873 


*All other permutations had a count of zero. 


Table 2 


If reverse order isn't necessarily the worst case 
for testing sort algorithms, then what is and how do we find 
it? A popular method for measuring the sortedness of a 
set of numbers is to count the number of "inversions." 
The concept is attributed to G. Cramer and is described by 
Knuth (in Section 5.1.1): "each inversion is a pair of 
elements that is 'out of sort'." For example, the set 
of numbers: 
of numbers: 


24153 
has four inversions; namely, 


(2,1) (4,1) (4,3) (5,3). 
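
Counting inversions is easy to express directly. The Python sketch below is an illustration of the definition just quoted; the function name inversions is chosen here.

```python
def inversions(sequence):
    """List every pair (larger, smaller) in which the larger value
    appears to the left of the smaller one."""
    pairs = []
    for i in range(len(sequence)):
        for j in range(i + 1, len(sequence)):
            if sequence[i] > sequence[j]:
                pairs.append((sequence[i], sequence[j]))
    return pairs

print(inversions([2, 4, 1, 5, 3]))        # [(2, 1), (4, 1), (4, 3), (5, 3)]
print(len(inversions([5, 4, 3, 2, 1])))   # 10, the maximum for five items
print(len(inversions([1, 2, 3, 4, 5])))   # 0
```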


Table 3 summarizes some key permutations and their 
number of inversions. It is clear after studying the 
table that using the inversion count, as defined, will not 
produce the optimal ordering of the permutations. The 
permutation with the largest number of inversions is 54321 
and this is clearly not the worst case. With the center 
element already in its proper place, many trial sets of 
comparisons will "succeed" in sorting this permutation when 
they shouldn't. Remember that for the problem at hand, 
the worst case permutation is that arrangement which will 
eliminate the greatest number of faulty sort sets. Another 
approach to measuring "out-of-sort"-ness was needed. 


Permutations and their Inversion Count 


Permutation    Inversions 

54321          10 
53421           9 
34521           7 
52341           7 
25431           7 
35241           7 
52324 
34512           6 
15432           6 
1542 
2135 
31245           2 
12435           1 
12345           0 


Table 3 
The approach I took was to formulate my own measure 
which I call a displacement coefficient. To order the 
permutations we do the following: 


1) Group the arrangements according to the 
number of values which are out of place. 
These groupings are indicated by the 
solid lines in Table 1. (Note that this 
could also have been done when using 
inversion counts to produce a better 
ordering.) 


2) Within each group, rank the permutations 
according to the following formula: 


$$\sum_{i=1}^{n} (a_i - i)^2$$ 

(This assumes that the a values range from 1 to n.) 


The reasoning behind the displacement coefficient is 
similar to that for the variance of a set of numbers. That 
is, the further out of position a number is, the "harder" 
it is for a sort algorithm to put it back where it belongs. 
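
The two-step ranking can be sketched in Python as follows. The function names are invented here, and the order of permutations that tie on both measures is arbitrary (here it simply follows the order in which itertools generates them), so it need not agree with the tie-breaking used in the production run.

```python
from itertools import permutations

def displacement_coefficient(perm):
    """Sum of squared distances between each value and its home position,
    assuming the values range from 1 to n."""
    return sum((a - i) ** 2 for i, a in enumerate(perm, start=1))

def out_of_place(perm):
    """Number of values not already in their sorted position."""
    return sum(1 for i, a in enumerate(perm, start=1) if a != i)

def test_order(n=5):
    """Rank by number of values out of place, then by displacement
    coefficient, with the worst arrangements first."""
    return sorted(permutations(range(1, n + 1)),
                  key=lambda p: (out_of_place(p), displacement_coefficient(p)),
                  reverse=True)

order = test_order()
print(order[0], displacement_coefficient(order[0]))   # (5, 3, 4, 2, 1) 38
print(displacement_coefficient((4, 3, 5, 2, 1)))      # 34
print(displacement_coefficient((5, 4, 3, 2, 1)))      # 40, but only 4 values out of place
print(order[-1])                                      # (1, 2, 3, 4, 5)
```

Note that 54321 has the largest coefficient of all, yet the grouping step places it behind every arrangement that has all five values out of place.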


From preliminary tests of the displacement coefficient, 
the inversion count and several other measures of sortedness, 
it appeared that the displacement coefficient did the most 
effective job of ordering the permutations for this problem. 
Its ordering was therefore used in the production run. 


The statistics gathered from the production run 
(Table 2) show several interesting things: 


1) The basic ordering of the permutations is 
very good. Permutations 5 and 6 both 
have displacement coefficient values of 34 
so their final ordering was arbitrary. 
Similarly, permutations 18, 19, and 20 all 
have values of 28 and could have been put 
in a different order. 


2) The gaps of permutations with zero counts 
seem to indicate that these arrangements 
are in some manner subsets of previous 
permutations. If permutations 9-16, 22, 
23, 27-32 were discarded, almost 700,000 
needless sort operations would have been 
saved. This strongly suggests that some 
type of pattern matching, applied to the 
120 permutations to identify these subset 
cases, would be beneficial. 


3) While better (at least for this problem) 
than the other measures of "out-of-sort"- 
ness, the displacement coefficient is not as 
good as it could be. The program was 
modified so that each set of sort comparisons 
would be tried on each of the first ten 
permutations regardless of previous failures. 
The results of this run are given in Table 4. 
From these statistics it appears that 45231 
(not 53421) is the worst ordering of five 
values. The best formula for measuring 
sortedness should therefore place 45231 at 
the top of the list. 
The investigation into other measures of sortedness 
is an area which requires more work. The research 
presented here has suggested an approach which needs to 
be pursued further. It is apparent that there is at least 
one category of sorting problems for which existing 
algorithms are not appropriate. 


Number of cases which fail to sort each of the 
first ten permutations. 


 1     42,707,328 
 2     42,707,328 
 3     43,147,474 
 4     43,147,474 
 5     43,089,794 
 6     43,089,794 
 7     43,089,794 
 8     43,089,794 
 9     42,879,276 
10     42,852,480 


Table 4 


That is, of the 45,489,780 combinations tested, 42,707,328 
failed to sort permutation number one; 42,707,328 failed to 
sort permutation number two, etc. 


This table suggests that even my initial formulation of the 
displacement coefficient is not optimal. Permutations number 
three and four seem to be the most sensitive for testing 
sorting schemes, and should be at the top of the list. 
Andree's Problem 


(The following problem comes from Prof. Richard 
Andree, the genial computing sage at the University of 
Oklahoma.) 


It has been proved that there exist arbitrarily long 
strings of consecutive positive integers such that each of 
the consecutive integers has as a factor a perfect square 
greater than one (probably different squares for different 
integers). Putting it in more formal terms: 


Given a positive integer N, there exist strings 
of N consecutive positive integers each of 
which contains a factor which is a perfect 
square greater than one. 


If N = 3, the consecutive numbers:     48   49   50         98   99  100 

contain as factors:                    4²   7²   5²         7²   3²  10² 


For N = 4 we have, for example:       242  243  244  245 

                                      11²   3²   2²   7² 

And, for N = 7: 


217070  217071  217072  217073  217074  217075  217076 

  7²      3²      4²     113²    11²      5²      2² 


Your preliminary problem is to find the first (that is, 
involving the smallest integers) string of N consecutive 
integers each of which contains a square greater than one 
as a factor, for N = 2, 3,...,10. 


The proof has, in fact, been generalized for higher 
powers than 2, so that we have: 


Given positive integers N and K, there exist 
strings of N consecutive positive integers 
each of which contains a perfect Kth power 
greater than one as a factor. 


Your second problem is to find the smallest such strings 
for K = 2, 3, 4, and 5 and N = 2, 3, 4,...,10 (this is 
a total of 36 strings). 

Here are a few more results, for checking purposes; the 
power factor is given below each of the consecutive integers: 


K = 3: 

N = 2:       80     81 
              8     27 

N = 3:     1375   1376   1377 
            125      8     27 

N = 4:    22624  22625  22626  22627 
              8    125     27   1331 


K = 4: 

N = 2:       80     81 
             16     81 

N = 3:    33614  33615  33616 
           2401     81     16 

The search for these strings is an interesting project 
which will bring out the advantages of good programming 
practices and also the advantage of a compiler versus an 
interpretive system on long running problems. 


Any gauche (but valid) program will smoke out the first few 
strings at various K levels, but it soon becomes clear that 
a very small amount of programming sense will cut the total 
time sharply--and then that even the improved algorithm 
may still make an unreasonable demand for computer time 
unless a compiler or assembler program is substituted for 
an interpreter (which most BASICs are). 
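
One example of the "programming sense" in question is to mark all multiples of each Kth power once, sieve fashion, instead of factoring every integer in turn. The Python sketch below is one possible approach, not a prescribed solution; first_run and its limit parameter (an upper bound on the search, chosen by the caller) are names introduced here.

```python
def first_run(n, k, limit):
    """Return the first integer of the first string of n consecutive
    integers, each divisible by a k-th power greater than one.
    Returns None if no such string ends at or below limit."""
    has_power = [False] * (limit + 1)
    base = 2
    while base ** k <= limit:
        power = base ** k
        # Composite bases only re-mark multiples already marked through
        # their prime factors; harmless, though skipping them is faster.
        for multiple in range(power, limit + 1, power):
            has_power[multiple] = True
        base += 1
    run = 0
    for value in range(1, limit + 1):
        run = run + 1 if has_power[value] else 0
        if run == n:
            return value - n + 1
    return None

print(first_run(3, 2, 1000))    # 48
print(first_run(4, 2, 1000))    # 242
print(first_run(2, 3, 1000))    # 80
print(first_run(3, 3, 2000))    # 1375
```

The memory needed grows with the limit, so for the longer strings the search would have to be done in segments or in a compiled language, as the text suggests.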


TRY IT--you'll enjoy the challenge. 
Penny Flipping Again 


In issue 71 we returned to the first of the Penny 
Flipping problems: 


A pile of C pennies is arranged so that each penny is 
heads up. We define an operation FLIP(Q) on this 
pile, which removes the top sub-pile of Q pennies, 
turns the sub-pile upside down and replaces them on 
top of the C - Q pennies remaining. The 
consecutive operations: 


FLIP(1),FLIP(2),...,FLIP(C),FLIP(1),FLIP(2),... 


are repeated until the stack returns to all heads. 
The value of N for a given C is the number of flips 
required to return the stack to heads. For C = 6, 
for example, N = 35. 


with results for the cases C = 1 through C = 240 and a 
conjecture that the value of N (flips) for any given 
value of C was one of 


$kC, \quad C^2, \quad C^2 - 1, \quad kC - 1.$ 


We hear now from another genial sage, David E. 
Ferguson, President of the Ferguson Tool Company, who 
points out: 


"If nis the smallest positive integer such that 


2" = + 1 (mod 2C+)) (for C coins) 
then if 2" = 1 (mod 2C+l), the number of flips is 
f(Cc) = nc 
while if 2° = -1 (mod 2C+1), 


f(c) = nC - 1. 
Note that n divides (2C+1)." 


Mr. Ferguson's observation does not seem to help 
particularly in finding N for a given C, but it does 
confirm the conjecture. 
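
For small piles the rule can be compared with a direct simulation. The Python sketch below is an illustration only; flips_to_restore and ferguson are names invented here, and the comparison is made for a few values of C greater than 1.

```python
def flips_to_restore(c):
    """Simulate FLIP(1), FLIP(2), ..., FLIP(c), FLIP(1), ... on c pennies
    and count the flips until the pile is all heads again."""
    stack = [True] * c            # True = heads; index 0 is the top of the pile
    count = 0
    q = 1
    while True:
        # Turning the top q pennies over reverses their order and flips each one.
        stack[:q] = [not coin for coin in reversed(stack[:q])]
        count += 1
        if all(stack):
            return count
        q = q + 1 if q < c else 1

def ferguson(c):
    """Flip count given by Mr. Ferguson's rule."""
    modulus = 2 * c + 1
    power, n = 2 % modulus, 1
    while power not in (1, modulus - 1):
        power = (power * 2) % modulus
        n += 1
    return n * c if power == 1 else n * c - 1

for c in (2, 3, 4, 6):
    print(c, flips_to_restore(c), ferguson(c))
# 2 3 3
# 3 9 9
# 4 11 11
# 6 35 35
```

The rule needs only about n multiplications, where the simulation needs on the order of nC flips of up to C pennies each.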


[Figure: a punched card marked "Cut on column 38", "Fold up on 
columns 67 and 72", and "Discard this part of card".] 


Labels for cassettes in Norelco-type boxes. 